Abstract. - The ingenious idea of single molecule imaging by hard x-ray Free Electron
Laser (X-FEL) pulses was recently proposed by Neutze et al. [1]. However, in their numerical
modelling of the Coulomb explosion several interactions were neglected and no reconstruction 
of the atomic structure was given. In this work we carried out improved molecular dynamics 
calculations including all quantum processes which affect the explosion. Based on this time 
evolution we generated composite elastic scattering patterns, and by using Fienup's algorithm 
successfully reconstructed the original atomic structure. The critical evaluation of these results 
gives guidelines and sets important conditions for future experiments aiming at single molecule
structure solution.

Introduction. - Structure solution without crystals was not conceivable for nearly a
century. However, linac-based X-FELs will soon become operational and their extremely
short and intense pulses offer the chance for a new type of experiment. The concept of single 
molecule imaging is based on fast measurement of a "slowly" exploding system. In the 
proposed experiment the molecule does not have time to deteriorate during a single x-ray 
pulse and enough elastically scattered photons can be collected to give information on the 
unmodified structure. There are two distinct theoretical parts of the problem: the Coulomb 
explosion and the image reconstruction process. In a previous paper we have given a detailed 
description of the Coulomb explosion [2]. In the present work we analyse the effect of the 
explosion on the image reconstruction. 

Up to now only two papers have been published on the Coulomb explosion initiated by 
hard x-ray pulses. In the first, the explosion of a lysozyme molecule was modelled [1]. In
the second, 100-1500 atom carbon clusters exploded under the influence of a single pulse [2].
These works are special molecular dynamics calculations of charged particles, which take into 
account various quantum processes initiated by the photoeffect. From this point of view, the 
second set of calculations is closer to reality, because it handles all processes which change 
the dynamics of the system. Therefore, our treatment of Coulomb explosion is based on this 
work. 



(*) E-mail: gf@szfki.hu

© EDP Sciences 






EUROPHYSICS LETTERS 



Image reconstruction from intensity data - also known as the phase retrieval in optics - has 
a more extended literature. Over years of practice Fienup's hybrid input-output method [3,4]
has proved to be the most successful algorithm of phase retrieval. It relies on a zero-density region
surrounding a non-periodic object in real space and on correspondingly dense sampling of the data in
reciprocal space. Those with a crystallographic background also use the term oversampling 
for this scenario, meaning that it uses more data than conventional Bragg sampling of crystals 
would allow [5]. While oversampling is not applicable in the case of crystals, it is well suited
for single molecule imaging. A recent paper already applied this method for the reconstruction 
of 3D synthetic data of a static macromolecule [6].

None of the earlier papers treated the Coulomb explosion and the reconstruction process 
together, which we shall do in the present work. We also discuss the fundamental questions: 
how the atom density affects the Coulomb explosion and reconstruction, what time-window of 
a single pulse is available for useful data collection and finally how multi-pulse measurements 
affect reconstruction. The answers are crucial for planning an experiment with the target of 
single molecule structure solution. 

Coulomb explosion of the cluster. - Atomic resolution imaging is important in many fields 
of science, from solid state physics to biology. This diverse interest covers cluster sizes ranging 
from a few tens to millions of atoms. The original proposal of single molecule imaging came 
from biology but biological macromolecules are too large to accurately model their Coulomb 
explosion. Therefore, we follow the time evolution of smaller atomic clusters. We expect that 
the results can be plausibly extended to larger systems and the general conclusions are also 
relevant to macromolecules. 

Our model system is a 200-atom carbon cluster forming an incomplete cube, as shown in 
fig. 1. Atoms are scattered around grid points of a simple cubic lattice and are held together
by central forces. Three different lattice spacings were investigated: a=1.5, 2.4 and 3.0 Å. The
depth of the bonding potential corresponds to the covalent bond for the 1.5 Å lattice, to the
van der Waals bond for the 3.0 Å lattice and it is linearly interpolated for the intermediate
2.4 Å lattice. With this choice of parameters we can study the characteristics of the explosion
for a wide range of atom density and bond strength. 
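
A minimal sketch of how such a model cluster could be set up is given below. It only illustrates the geometry described above. Which 16 of the 216 sites of a 6x6x6 grid are left empty, the size of the random scatter (`jitter`) and the function name `build_cluster` are assumptions made for illustration and are not taken from the original calculation.

```python
import numpy as np

def build_cluster(a, n_atoms=200, n_side=6, jitter=0.05, seed=0):
    """Return (n_atoms, 3) positions in angstrom on a jittered simple cubic grid."""
    rng = np.random.default_rng(seed)
    # All grid points of an n_side^3 simple cubic lattice (216 sites for n_side=6).
    idx = np.indices((n_side, n_side, n_side)).reshape(3, -1).T
    # Keep only the first n_atoms sites, so the cube is left incomplete (assumed choice).
    sites = idx[:n_atoms].astype(float) * a
    # Scatter the atoms around the grid points; jitter is a hypothetical fraction of a.
    sites += rng.normal(scale=jitter * a, size=sites.shape)
    # Centre the cluster on the origin.
    return sites - sites.mean(axis=0)

# The three lattice spacings investigated in the text (angstrom).
clusters = {a: build_cluster(a) for a in (1.5, 2.4, 3.0)}
```

The bonding potential is deliberately left out here, since its functional form is not specified in this section.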

The calculation follows the time evolution of the system while a Gaussian x-ray
pulse is incident on the sample. We solve the classical equations of motion for all the particles
(atoms, electrons and ions) as described in [2]. Quantum processes are taken into account as
random events with probabilities corresponding to their cross sections. We include the photoeffect,
Auger processes, Coulomb interaction and inelastic electron scattering. The parameters
of the pulse are: 10 keV photon energy, 5 × 10^12 integrated photon number and 10 fs full width
at half maximum. Note that the pulse width is shorter than initially expected at hard x-ray
FELs [7,8]. However, current efforts to decrease the pulse width below 100 fs are promising
and reaching 10 fs is not unrealistic [9]. The reason for this choice is that the shorter the pulse,
the higher the ratio of photons available for useful imaging.

To illustrate the changes of the atomic configuration during a pulse, the time evolution of a 
typical Coulomb explosion of the a=1.5 Å cluster is shown in fig. 2. The total period of model
calculation is between t=0 and 30 fs where 15 fs corresponds to the maximum of the Gaussian
pulse. It is clear that up to 10 fs the cluster is almost unchanged. Visible distortions
start to develop at 12 fs and by 18 fs the cluster has totally disintegrated. To quantify the
explosion we calculated the cluster charge and the mean displacement as a function of time
for the three atom densities (see fig. 3). The total charge shows how many electrons scatter
elastically while the mean displacement gives an estimate for the distortion of geometry. The
trend in both series of curves indicates that the explosion takes longer for low atom densities.
This is in agreement with natural expectations: the Coulomb repulsion is smaller for ions at
larger distances. We can learn two important facts from figs. 2 and 3: (i) useful data can be
collected only in the first part of the pulse, (ii) low density systems will allow more time for 
data collection. 
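
The two quantities used above to characterise the explosion can be sketched as follows. This is an illustration only: the function names are hypothetical, and reading the normalized cluster charge as the fraction of electrons still bound to the atoms is an assumption, since the precise definition is not given here.

```python
import numpy as np

def mean_displacement(r0, rt, a):
    """Mean atomic displacement between two configurations, in units of the spacing a."""
    return np.linalg.norm(rt - r0, axis=1).mean() / a

def normalized_charge(bound_electrons, z=6):
    """Fraction of the initial electrons still bound to the atoms (assumed definition)."""
    bound_electrons = np.asarray(bound_electrons, dtype=float)
    return bound_electrons.sum() / (z * len(bound_electrons))

# Toy data: four carbon atoms, each shifted by 0.3 angstrom along x.
r0 = np.zeros((4, 3))
rt = r0 + np.array([0.3, 0.0, 0.0])
d = mean_displacement(r0, rt, a=1.5)      # displacement in units of a
c = normalized_charge([6, 5, 4, 6])       # 21 of 24 electrons remain bound
```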

However, no cleverly chosen parameter can replace the real reconstruction process. During 
the explosion the elastic scattering pattern changes significantly and what we measure is a 
composite pattern. It is not trivial that using this composite pattern as input data, the 
original atomic configuration can be reconstructed. 

The reconstruction algorithm. - In this work we used a modification of Fienup's hybrid 
input-output algorithm. In real space the charge density is represented in a cubic box with 
cell edge L and grid spacing ΔL. The box size L must be larger than the object and the grid
spacing ΔL must be sufficiently fine to reconstruct atoms. We used L=25, 35 and 40 Å for
the three different atom densities while ΔL=0.4 Å was kept constant. The real space charge
density and the reciprocal space scattering amplitudes are related by the Discrete Fourier
Transform. Thus amplitudes in reciprocal space are also represented in a cubic box with a
cell edge 1/ΔL and grid spacing 1/L. Measurement of elastic scattering is required at points
of this dense grid. The reciprocal space box limits the maximum momentum transfer at
q_max = π/ΔL. We use all data within the sphere of radius q_max and treat unobserved data
outside the sphere as zeros. 
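
The relation between the two grids can be illustrated with a short sketch. It assumes the standard DFT frequency layout of NumPy (`np.fft.fftfreq`) and uses the L=40 Å box, for which L/ΔL is an integer; the function name `reciprocal_grid` is hypothetical.

```python
import numpy as np

def reciprocal_grid(L, dL):
    """|q| on the DFT grid belonging to a cubic box of edge L and spacing dL."""
    n = int(round(L / dL))
    # fftfreq gives frequencies in cycles per angstrom with spacing 1/L;
    # multiplying by 2*pi converts them to momentum transfer q.
    q1 = 2.0 * np.pi * np.fft.fftfreq(n, d=dL)
    qx, qy, qz = np.meshgrid(q1, q1, q1, indexing="ij")
    return np.sqrt(qx**2 + qy**2 + qz**2)

L, dL = 40.0, 0.4                 # box edge and grid spacing in angstrom
q = reciprocal_grid(L, dL)
q_max = np.pi / dL                # maximum momentum transfer along a box axis
inside = q <= q_max               # points treated as measured
resolution = 2.0 * np.pi / q_max  # equals 2*dL
```

With ΔL=0.4 Å the nominal resolution 2π/q_max equals 0.8 Å, and the limiting sphere covers roughly π/6 of the reciprocal space box.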

The algorithm requires a molecular support, which completely surrounds the object. The
object is smaller than its support and the charge density is positive: these are the simple a priori
constraints. We used a spherical volume, which is a loose support because it only poorly approximates
the shape of the cluster. The volume ratio V_box/V_support is called the oversampling parameter,
denoted by σ. For the three atom densities we chose the radius 10, 14 and 16 Å, corresponding
to σ = 3.73, for which the algorithm is expected to work. Fienup's method is a special type
of density modification: it cycles between real and reciprocal space by the Fourier transform.
In real space the charge density is modified only where the support or positivity constraints
are violated, while in reciprocal space the observed moduli and model phases are combined
unconditionally. The q = 0 reciprocal space amplitude corresponds to the total charge
but is unobserved. We chose the correct ab initio treatment: it is initialized to zero and then
allowed to change freely. The iteration starts with a random phase set in reciprocal space.
Testing multiple phase sets is important: some phase sets converge faster, while others do not
converge within a fixed number of iterations. There is a single parameter β in the real space part
of the algorithm, which acts as feedback. In the literature the typical value of β is 0.5-0.9; we
used β = 0.7. Our experience with the original algorithm is that after a number of iterations
a large negative charge density develops and oscillates, which is a sign of over-amplification.
The solution was to set the charge density outside the support to zero before each real space
modification. This makes convergence slower but the quality of reconstruction is better. To
evaluate a solution it is best both to plot the charge density and to analyse the
number and position of atoms by peak picking.
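
The procedure described above can be sketched in a few lines. This is one possible reading of the modified hybrid input-output scheme, not the original implementation: the function names, the `observed` mask used to leave the q = 0 amplitude free, the final clipping of the returned density and the toy object are all assumptions. Convergence of the toy run within the chosen number of iterations is not guaranteed.

```python
import numpy as np

def oversampling_ratio(L, radius):
    """V_box / V_support for a cubic box of edge L and a spherical support."""
    return L**3 / (4.0 / 3.0 * np.pi * radius**3)

def hio_step(g, moduli, observed, support, beta=0.7):
    """One modified hybrid input-output iteration on the real space density g."""
    G = np.fft.fftn(g)
    # Reciprocal space: impose observed moduli and keep the model phases;
    # unobserved amplitudes (here q = 0) keep their current model value.
    G = np.where(observed, moduli * np.exp(1j * np.angle(G)), G)
    g_out = np.fft.ifftn(G).real
    # Real space: points that violate the support or positivity constraint.
    violated = ~support | (g_out < 0)
    # Modification from the text: zero the previous density outside the
    # support before the feedback is applied.
    g_prev = np.where(support, g, 0.0)
    return np.where(violated, g_prev - beta * g_out, g_out)

def hio_reconstruct(moduli, observed, support, beta=0.7, n_iter=200, seed=0):
    """Iterate from a random phase set; return the density clipped to the constraints."""
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=moduli.shape)
    # Unobserved amplitudes are initialized to zero.
    G0 = np.where(observed, moduli * np.exp(1j * phases), 0.0)
    g = np.fft.ifftn(G0).real
    for _ in range(n_iter):
        g = hio_step(g, moduli, observed, support, beta)
    return np.where(support, np.clip(g, 0.0, None), 0.0)

# Toy object: three point-like atoms inside a spherical support (hypothetical sizes).
n = 16
x = np.indices((n, n, n)) - n // 2
support = (x**2).sum(axis=0) <= 5**2
rho = np.zeros((n, n, n))
rho[8, 8, 8] = rho[10, 7, 9] = rho[6, 9, 8] = 1.0
moduli = np.abs(np.fft.fftn(rho))
observed = np.ones(rho.shape, dtype=bool)
observed[0, 0, 0] = False          # the q = 0 amplitude is not measured
estimate = hio_reconstruct(moduli, observed, support, n_iter=50)
```

A useful sanity check of such a sketch is that the true density is a fixed point of `hio_step`. In practice several random seeds would be tried, in line with the remark on multiple phase sets.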

Our model cluster is a non-centrosymmetric object with pseudo-symmetries, so its atomic
resolution reconstruction is far from trivial. We carefully tested the algorithm by reconstructing
static configurations of the exploding cluster at different times up to 20 fs. We found that
any static configuration can be reconstructed in this time interval, so only the quality of real 
data will limit the structure solution. 

3D data and multi-pulse measurements. - The reconstruction algorithm needs 3D data 
while in a single-pulse experiment only a 2D slice can be collected. Therefore, even under ideal
conditions, we cannot measure the full data set during a single pulse. We have to merge the
results of many repeated measurements, which are made on identical replicas of the molecule
in different orientations. Here we give an estimate for the minimum number of independent 
orientations and the total number of pulses necessary for a complete data set from a small 
biological molecule. 

Let us take a typical macromolecule which is enclosed in the real space box with edge 
L=60 Å and require reconstruction on a grid with ΔL=0.4 Å spacing. Then reciprocal space
data is represented in a box with edge 1/ΔL and grid spacing 1/L. At the maximum resolution
data is accessible only inside the limiting sphere with radius R = 1/(2ΔL). The scattering
intensity is sampled at one point per (1/L)^3 volume and 4π/3 · R^3 · L^3 = 1.8 × 10^6 is the total number
of 3D data points. A single orientation provides 2D data on the surface of the Ewald sphere
with radius R/2. Assuming that each (1/L)^2 area of the 4π(R/2)^2 surface gives independent
information, a single orientation contributes 1.8 × 10^4 points to the 3D data. So in this example
the number of independent orientations is 100. For larger molecules this number increases
with the linear size of the box. 
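
The counting argument above can be checked numerically. The short sketch below uses only the numbers given in the text; the variable names are chosen for illustration.

```python
import math

L = 60.0     # real space box edge, angstrom
dL = 0.4     # real space grid spacing, angstrom

R = 1.0 / (2.0 * dL)                          # radius of the limiting sphere, 1/angstrom
n_3d = 4.0 / 3.0 * math.pi * R**3 * L**3      # grid points inside the limiting sphere
n_2d = 4.0 * math.pi * (R / 2.0)**2 * L**2    # independent points on one Ewald sphere
n_orient = n_3d / n_2d                        # algebraically equal to 4*R*L/3
```

Because the ratio reduces to 4RL/3, the number of orientations grows linearly with the box size L at fixed resolution.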

This is the minimum number of orientations, which must be measured with adequate 
statistics. Repeated measurements are needed both to cover all required orientations and also 
to improve their statistics. Sorting individual single-pulse patterns into orientation bins is 
not discussed here but will need special attention. Let us suppose there is no background, 
and the reconstruction algorithm can tolerate 10% noise. Considering that x-ray scattering
is strongly anisotropic we estimate that on average 100 elastically scattered photons per pixel
must be collected. For the above example this is the order of 10^8 photons/orientation and 10^10
photons total. The elastic cross section of carbon and the intensity of 5 × 10^12 photons/pulse
lead to about 5 × 10^3 elastically scattered photons per pulse. Therefore, to construct 3D data
with adequate statistics approximately 2 × 10^6 repeated measurements must be done using the
same number of replicas of the molecule. 
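
The final step of this estimate is simple arithmetic, sketched below with the order-of-magnitude values quoted in the text. Relating the 10% noise level to 100 photons through Poisson counting statistics (relative error 1/sqrt(N)) is an assumption about how the per-pixel figure arises, not a statement from the original.

```python
import math

noise = 0.10
# Poisson counting: a relative error of 1/sqrt(N) = 10 % needs N = 100 photons.
photons_per_pixel = 1.0 / noise**2

total_photons = 1e10        # order of magnitude quoted for the full 3D data set
photons_per_pulse = 5e3     # elastically scattered photons per pulse
n_pulses = total_photons / photons_per_pulse
```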

Time integral and gating. - The Coulomb explosion fundamentally determines the quality
of the data available for structure solution. During data collection the scattering
pattern changes drastically. Even a 2D section of data measured by a single pulse is a composite
pattern, a time integral weighted by the pulse intensity.

We need the optimum integration time with the highest number of elastically scattered 
photons and a pattern closest to that of the original structure. If the time-window is too short
then the statistics of individual patterns are poor. If the time-window is too long then the
scattering of the disintegrated cluster becomes dominant, washing out the original structural
information. For the sake of data quality part of the pulse must be thrown away and, because
the total number of photons is fixed, more measurements are needed for the same statistics.
We will show that with the pulse width of 10 fs the increase in the number of measurements
is not drastic, a factor of 2 to 4. The situation could be worse with the pulse width of 100 fs
because the same level of distortion is reached well before the peak, which means an order
of magnitude fewer photons for reconstruction.

Here we give an estimate for the optimum integration time using 10 fs pulses. For simplicity 
we assume that 3D data can be obtained from a single experiment, as if separate explosions 
took place the same way. Integration is approximated by summing the scattering pattern
at each 0.5 fs time-slice with the weight of the Gaussian pulse intensity. The upper limit of
integration is t_max. Statistical uncertainty of real experiments is simulated by adding 10%
noise to the composite pattern. Then we make a reconstruction for each composite pattern
and compare it to the original charge density. We accept the reconstruction for t_max if all
the atoms are found and the standard deviation of displacement does not exceed 0.3 Å. With
these requirements the structure of the a=1.5, 2.4 and 3.0 Å clusters was successfully recovered
up to t_max=12, 14 and 15 fs respectively. These time-windows correspond to 25, 40 and 50%
useful photons per pulse [10].
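
The weighting and the quoted photon fractions can be reproduced with a short sketch. It assumes a Gaussian pulse of 10 fs full width at half maximum peaking at 15 fs, as stated earlier in the text. The multiplicative Gaussian noise model and the function names are assumptions, since the form of the 10% noise is not specified.

```python
import math
import numpy as np

FWHM = 10.0      # pulse width, fs
T_PEAK = 15.0    # time of the pulse maximum, fs
SIGMA = FWHM / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def useful_fraction(t_max):
    """Fraction of the pulse photons delivered before t_max (untruncated Gaussian)."""
    return 0.5 * (1.0 + math.erf((t_max - T_PEAK) / (SIGMA * math.sqrt(2.0))))

def composite_pattern(patterns, times, t_max, noise=0.10, seed=0):
    """Pulse-weighted sum of the per-slice patterns up to t_max, with added noise."""
    rng = np.random.default_rng(seed)
    patterns = np.asarray(patterns, dtype=float)
    times = np.asarray(times, dtype=float)
    weights = np.exp(-0.5 * ((times - T_PEAK) / SIGMA) ** 2)
    keep = times <= t_max
    total = np.tensordot(weights[keep], patterns[keep], axes=1)
    # Assumed noise model: Gaussian with a relative standard deviation of `noise`.
    return total * (1.0 + noise * rng.standard_normal(total.shape))

times = np.arange(0.0, 30.5, 0.5)                     # 0.5 fs time-slices from 0 to 30 fs
fractions = {t: useful_fraction(t) for t in (12.0, 14.0, 15.0)}
extra_measurements = {t: 1.0 / f for t, f in fractions.items()}
demo = composite_pattern(np.ones((len(times), 4, 4)), times, t_max=12.0)
```

The fractions come out close to 0.24, 0.41 and 0.50, consistent with the rounded values of 25, 40 and 50%, and their inverses give the factor of roughly 2 to 4 in the number of measurements.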

Collecting data for longer periods degrades the quality of the reconstruction: the useful
structural information is quickly washed out. It is informative to follow the scattering pattern
of individual time-slices. What happens is that the disintegrating cluster becomes random
in real space, so its Fourier transform turns into white noise. Fig. 4 shows the quality of
reconstruction for the a=1.5 Å cluster using integrated data up to t_max=12 and 18 fs. While
with shorter integration we get a structure practically identical to the original cluster, just 6 fs 
longer integration leads to the loss of half of the atoms and to increased structural disorder. 
Similar differences were found in the case of clusters with larger interatomic distances. The 
summary of the above results is: accurate gating will be critical for a real experiment. 
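
The acceptance test used above (all atoms found, standard deviation of displacement not exceeding 0.3 Å) could be sketched as follows. The matching radius, the reading of the standard deviation as an r.m.s. atom-to-peak distance and the function name are assumptions. The simple nearest-peak search does not enforce a one-to-one assignment.

```python
import numpy as np

def accept_reconstruction(atoms, peaks, max_std=0.3, match_radius=0.75):
    """Accept if every atom has a nearby peak and the displacement spread is small."""
    atoms = np.asarray(atoms, dtype=float)
    peaks = np.asarray(peaks, dtype=float)
    if len(peaks) == 0:
        return False
    # Distance from every original atom to every picked peak (angstrom).
    d = np.linalg.norm(atoms[:, None, :] - peaks[None, :, :], axis=2)
    nearest = d.min(axis=1)
    # match_radius is a hypothetical cutoff, half of the 1.5 angstrom spacing.
    all_found = bool(np.all(nearest <= match_radius))
    # Displacement spread, read here as the r.m.s. atom-to-peak distance.
    rms = float(np.sqrt(np.mean(nearest**2)))
    return all_found and rms <= max_std

atoms = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 1.5]])
good = accept_reconstruction(atoms, atoms + 0.1)        # all peaks slightly shifted
bad = accept_reconstruction(atoms, (atoms + 0.1)[:2])   # half of the atoms missing
```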

The effect of independent replicas. - After giving an estimate for the optimum integration 
time of single pulses, we consider the effect of averaging in multi-pulse experiments. Although 
in principle identical replicas of the molecule are used in consecutive experiments, their disintegration
will follow different pathways. This is partly due to thermal vibration, which displaces
the starting atomic positions, and partly to the stochastic nature of photo-ionization, Auger
processes and secondary ionization, which changes the dynamics of the Coulomb explosion.

To check the joint effect of the time integral and multi-pulse averaging on structure solution,
we followed 100 explosions of the a=1.5 Å cluster. The deviation of the starting
configurations was chosen to be small, corresponding to low temperature (T=30 K). Even
so, the time evolution leads to differences in the atomic positions and charge states, which are
already apparent at t_max=12 fs. The scattering pattern was calculated as before, but the time
integrals of all individual explosions were added together. We found that the quality of reconstruction
is somewhat better than for the single-pulse case. This is in accordance with expectations:
averaging the contribution of many slightly different configurations results in a smaller deviation
from the original structure. The improvement is not drastic: the useful integration time
can be increased by just a femtosecond.
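
The averaging over independent replicas can be illustrated with a minimal sketch. It treats the atoms as point scatterers whose weight is the number of bound electrons, which ignores the q-dependence of the atomic form factor. This simplification and the function names are assumptions. Only a single time-slice is shown; the time integration would be applied to each replica before averaging.

```python
import numpy as np

def intensity(positions, weights, q):
    """Elastic intensity |sum_j w_j exp(i q.r_j)|^2 at the points q of shape (m, 3)."""
    phase = np.exp(1j * (q @ positions.T))     # shape (m, n_atoms)
    return np.abs(phase @ weights) ** 2

def multi_pulse_pattern(replicas, weights, q):
    """Incoherent average of the patterns of independent replicas."""
    return np.mean([intensity(r, w, q) for r, w in zip(replicas, weights)], axis=0)

# Two unit scatterers separated by pi along x: the waves add at q = 0
# and cancel at q = (1, 0, 0).
positions = np.array([[0.0, 0.0, 0.0], [np.pi, 0.0, 0.0]])
weights = np.ones(2)
q = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
single = intensity(positions, weights, q)
averaged = multi_pulse_pattern([positions, positions], [weights, weights], q)
```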

Conclusion and outlook. - In this work we have shown that single molecule imaging using 
hard X-FEL pulses might be possible, but the experimental difficulties are more severe than 
anticipated earlier. Our results differ from those of previous publications in several respects. 
First, the time evolution of the Coulomb explosion is more reliable as we included all quantum 
processes in the simulation. Second, we time integrated the scattering pattern of the exploding 
cluster and took into account the effect of multiple replicas. Reconstruction of the original 
atomic structure was achieved using these composite patterns at 0.8 Å resolution. This is in
contrast to previous work, which reconstructed a static structure at 2.5 Å resolution [6].

While giving a unified treatment of the Coulomb explosion and reconstruction, we arrived 
at new conclusions: (i) We demonstrated that only part of each pulse can be efficiently used
for data collection, therefore very fast gating will be required. (ii) We analysed the reconstruction
as a function of atom density, and found that lower density will allow more time for
data collection. In terms of the fraction of total photons available for imaging this means a 
maximum of 50%. (iii) We also showed that averaging the contribution of individual replicas 
in a multi-pulse experiment slightly improves the quality of reconstruction. 

Finally, we would like to call attention to problems not discussed in the paper, but important
for the feasibility of real experiments. In practice a complete data set can be collected by
measuring a large number of identical replicas, and individual scattering patterns have to be 
arranged into 3D data on a reciprocal space grid. Due to the low statistics of single patterns 
it will be difficult to sort them into distinct orientation bins. Although electron microscopy 
successfully handles a similar problem [11], in the case of X-FEL experiments it remains to
be solved. Another serious problem is the low signal to noise ratio. As energy selective
detectors are useless at the femtosecond timescale, the detector will count all the photons in
the experiment. As the elastic scattering is weak, any spurious scattering will distort useful
structural information. It is difficult to estimate the background level, but Compton scattering
will certainly give a contribution. Its weight relative to elastic scattering is small while
the electrons are localized on the atoms, but will drastically increase after photoionization.
Of course, background will also come from instrumental sources such as the beam path and
sample environment. Here we pointed out just a few experimental problems; there will probably
be several others. However, the importance of single molecule imaging is so great that
once it is shown to work under ideal conditions, it is worth the effort to realize the experiment
and attempt the solution of these problems.

Fig. 1 - Model system of the 200 atom carbon cluster forming an incomplete cube. Only the most
dense variant with 1.5 Å interatomic distances is shown.

* * * 
Fig. 2 - Time evolution of a typical Coulomb explosion of the a=1.5 Å cluster. The four snapshots
are taken at t=10, 12, 15 and 18 fs, where t=15 fs corresponds to the maximum of the x-ray pulse.

Fig. 3 - Time evolution of the cluster charge and mean displacement for the three atom densities.
The charge is normalized to 1, the mean displacement is scaled by 1/a. The symbols circle, square
and diamond correspond to nearest atom distances a=1.5, 2.4 and 3.0 Å respectively.



Fig. 4 - Reconstructed charge density of the a=1.5 Å cluster using integrated data up to t_max=12
and 18 fs. The difference of quality is obvious.